Fix AutoScheme low memory flag propagation from CLI by lvliang-intel · Pull Request #1596 · intel/auto-round

lvliang-intel · 2026-03-23T05:50:48Z

Description

This PR fixes inconsistent memory-mode behavior between the main AutoRound flow and AutoScheme when running from CLI.

The CLI passed low_gpu_mem_usage and low_cpu_mem_usage to AutoRound but not to AutoScheme.
As a result, AutoScheme ran under a different memory strategy than the main quantization flow, which could cause unexpectedly high CPU RAM usage and make the memory behavior observed during AutoScheme generation misleading.

CUDA_VISIBLE_DEVICES=6 python -m auto_round /mnt/disk2/lvl/Qwen3-1.7B/ --target_bits 2.5 --ignore_scale_zp_bits --options "gguf:q2_k_s,gguf:q4_k_s" --iters 1 --nsamples 16

Before fix:

After fix:

Type of Change

Related Issues

#1586

Checklist Before Submitting

My code has been tested locally.
Documentation has been updated as needed.
New or updated tests are included where applicable.

Signed-off-by: lvliang-intel <liang1.lv@intel.com>

Copilot

Pull request overview

Fixes inconsistent memory-strategy behavior when AutoScheme is invoked via the CLI --avg_bits/--options path by ensuring the same low-memory flags used by the main AutoRound flow are applied to AutoScheme.

Changes:

Move low_cpu_mem_usage resolution earlier so it’s available before AutoScheme construction.
Pass low_gpu_mem_usage and low_cpu_mem_usage into the AutoScheme(...) constructor when --avg_bits is used.
Update the CLI help text for the deprecated --low_cpu_mem_usage flag.

auto_round/__main__.py
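
The changes listed in the overview can be illustrated with a minimal, self-contained sketch. It does not reproduce the code in auto_round/__main__.py: `SchemeStandIn`, `parse_cli`, `resolve_memory_flags`, and `build_configs` are hypothetical names, the flag defaults are assumptions, and treating `--target_bits` as an alias of `--avg_bits` is assumed only because both spellings appear in this PR. The point is the ordering: the memory flags are resolved once, before the scheme object is built, and the same values go to both consumers.

```python
import argparse
from dataclasses import dataclass


@dataclass
class SchemeStandIn:
    """Hypothetical stand-in for AutoScheme, reduced to the fields discussed here."""
    avg_bits: float
    options: tuple
    low_gpu_mem_usage: bool = False
    low_cpu_mem_usage: bool = False


def parse_cli(argv):
    # Simplified parser; defaults and the --target_bits alias are assumptions.
    parser = argparse.ArgumentParser()
    parser.add_argument("--avg_bits", "--target_bits", dest="avg_bits",
                        type=float, default=None)
    parser.add_argument("--options", type=str, default=None)
    parser.add_argument("--low_gpu_mem_usage", action="store_true")
    parser.add_argument("--low_cpu_mem_usage", action="store_true")
    return parser.parse_args(argv)


def resolve_memory_flags(args):
    # Resolve once, before any scheme object exists, so that every
    # consumer sees the same values. The real resolution logic may differ.
    return {
        "low_gpu_mem_usage": bool(args.low_gpu_mem_usage),
        "low_cpu_mem_usage": bool(args.low_cpu_mem_usage),
    }


def build_configs(argv):
    args = parse_cli(argv)
    mem = resolve_memory_flags(args)

    scheme = None
    if args.avg_bits is not None:
        options = tuple(
            o.strip() for o in (args.options or "").split(",") if o.strip()
        )
        # The fix: the scheme receives the same memory flags as the main flow.
        scheme = SchemeStandIn(avg_bits=args.avg_bits, options=options, **mem)

    # Keyword arguments for the main quantization flow (also a stand-in).
    main_kwargs = {"scheme": scheme, **mem}
    return scheme, main_kwargs
```

Calling `build_configs(["--target_bits", "2.5", "--options", "gguf:q2_k_s,gguf:q4_k_s", "--low_cpu_mem_usage"])` in this sketch yields a scheme whose `low_cpu_mem_usage` matches the value handed to the main flow, which is the consistency this PR restores.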

lvliang-intel added 2 commits March 23, 2026 13:35

Fix AutoScheme low memory flag propagation from CLI

6d70a38

Signed-off-by: lvliang-intel <liang1.lv@intel.com>

Merge branch 'main' of https://github.com/intel/auto-round into lvl/f…

e52f970

…ix_cli_low_cpu_mem

Copilot AI review requested due to automatic review settings March 23, 2026 05:50

Copilot started reviewing on behalf of lvliang-intel March 23, 2026 05:54

Copilot AI reviewed Mar 23, 2026

auto_round/__main__.py (outdated, resolved)

auto_round/__main__.py (resolved)

lvliang-intel and others added 2 commits March 23, 2026 14:20

Update auto_round/__main__.py

555d2a6

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

[pre-commit.ci] auto fixes from pre-commit.com hooks

efa033f

for more information, see https://pre-commit.ci

chensuyue approved these changes Mar 27, 2026

chensuyue merged commit 11debae into main Mar 27, 2026
30 checks passed

chensuyue deleted the lvl/fix_cli_low_cpu_mem branch March 27, 2026 01:43

chensuyue merged 4 commits into main from lvl/fix_cli_low_cpu_mem

3 participants
